22 research outputs found

    Bayesian model predictive control: Efficient model exploration and regret bounds using posterior sampling

    Tight performance specifications in combination with operational constraints make model predictive control (MPC) the method of choice in various industries. As the performance of an MPC controller depends on a sufficiently accurate objective and prediction model of the process, a significant effort in the MPC design procedure is dedicated to modeling and identification. Driven by the increasing amount of available system data and advances in the field of machine learning, data-driven MPC techniques have been developed to facilitate the MPC controller design. While these methods are able to leverage available data, they typically do not provide principled mechanisms to automatically trade off exploitation of available data against exploration to improve and update the objective and prediction model. To address this, we present a learning-based MPC formulation using posterior sampling techniques, which provides finite-time regret bounds on the learning performance while being simple to implement using off-the-shelf MPC software and algorithms. The performance analysis of the method is based on posterior sampling theory, and its practical efficiency is illustrated using a numerical example of a highly nonlinear car-trailer system.
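
    The loop described above can be sketched in a few lines: at the start of each learning episode, one model is sampled from the current parameter posterior, a standard MPC problem is solved for that sampled model, and the posterior is updated with the observed transitions. The scalar linear model, Gaussian prior, and episode lengths below are illustrative assumptions, not the paper's setup.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
a_true, b_true, sig = 0.9, 0.5, 0.05      # unknown ground-truth model
mu, Sigma = np.zeros(2), np.eye(2)        # Gaussian posterior over (a, b)
H = 10                                    # MPC prediction horizon

def solve_mpc(a, b, x0):
    # Standard finite-horizon MPC for the sampled model: an off-the-shelf QP.
    x, u = cp.Variable(H + 1), cp.Variable(H)
    cons = [x[0] == x0, cp.abs(u) <= 1]
    cons += [x[t + 1] == a * x[t] + b * u[t] for t in range(H)]
    cp.Problem(cp.Minimize(cp.sum_squares(x) + 0.1 * cp.sum_squares(u)), cons).solve()
    return float(u.value[0])

x = 1.0
for episode in range(20):
    a_s, b_s = rng.multivariate_normal(mu, Sigma)   # posterior sampling: one model per episode
    for t in range(30):
        u = solve_mpc(a_s, b_s, x)                  # act optimally for the sampled model
        x_next = a_true * x + b_true * u + sig * rng.standard_normal()
        phi = np.array([x, u])                      # regressor for x_next = phi @ (a, b) + noise
        Sigma_inv = np.linalg.inv(Sigma)            # conjugate Bayesian linear-regression update
        Sigma = np.linalg.inv(Sigma_inv + np.outer(phi, phi) / sig**2)
        mu = Sigma @ (Sigma_inv @ mu + phi * x_next / sig**2)
        x = x_next
```

    Because the sampled model is held fixed within an episode, the controller commits to one plausible hypothesis at a time, which is what drives directed exploration in posterior sampling schemes.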

    Performance and safety of Bayesian model predictive control: Scalable model-based RL with guarantees

    Despite the success of reinforcement learning (RL) in various research fields, relatively few algorithms have been applied to industrial control applications. The reason for this unexplored potential is partly related to the significant required tuning effort, the large number of required learning episodes, i.e., experiments, and the limited availability of RL methods that can address high-dimensional and safety-critical dynamical systems with continuous state and action spaces. By building on model predictive control (MPC) concepts, we propose a cautious model-based reinforcement learning algorithm to mitigate these limitations. While the underlying policy of the approach can be efficiently implemented in the form of a standard MPC controller, data-efficient learning is achieved through posterior sampling techniques. We provide a rigorous performance analysis of the resulting 'Bayesian MPC' algorithm by establishing Lipschitz continuity of the corresponding future reward function, and we bound the expected number of unsafe learning episodes using an exact penalty soft-constrained MPC formulation. The efficiency and scalability of the method are illustrated using a 100-dimensional server cooling example and a nonlinear 10-dimensional drone example by comparing the performance against nominal posterior MPC, which is commonly used for data-driven control of constrained dynamical systems.
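
    The exact-penalty soft-constrained MPC formulation mentioned above can be illustrated with a small example: state constraints are relaxed by nonnegative slack variables whose l1 penalty, for a sufficiently large weight, reproduces the hard-constrained solution whenever one exists, while keeping the problem feasible otherwise. The linear model, horizon, and penalty weight below are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

H, rho = 15, 1e3                          # horizon and exact-penalty weight
A = np.array([[1.0, 0.1], [0.0, 1.0]])    # assumed linear model
B = np.array([[0.0], [0.1]])

x = cp.Variable((H + 1, 2))
u = cp.Variable((H, 1))
xi = cp.Variable(H + 1, nonneg=True)      # slacks on the softened state constraint

cost = cp.sum_squares(x) + 0.1 * cp.sum_squares(u) + rho * cp.sum(xi)
cons = [x[0] == np.array([2.0, 0.0])]
cons += [x[t + 1] == A @ x[t] + B @ u[t] for t in range(H)]
cons += [x[:, 0] <= 1.5 + xi,             # softened state bound: always feasible
         cp.abs(u) <= 1]                  # hard input bound
cp.Problem(cp.Minimize(cost), cons).solve()
print("first input:", u.value[0], "max slack:", xi.value.max())
```

    The l1 penalty (rather than a squared one) is what makes the relaxation exact above a finite weight, which is the property the unsafe-episode bound builds on.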

    Learning-based Moving Horizon Estimation through Differentiable Convex Optimization Layers

    To control a dynamical system, it is essential to obtain an accurate estimate of the current system state based on uncertain sensor measurements and existing system knowledge. An optimization-based moving horizon estimation (MHE) approach uses a dynamical model of the system, and further allows for the integration of physical constraints on system states and uncertainties, to obtain a trajectory of state estimates. In this work, we address the problem of state estimation in the case of constrained linear systems with parametric uncertainty. The proposed approach makes use of differentiable convex optimization layers to formulate an MHE state estimator for systems with uncertain parameters. This formulation allows us to obtain the gradient of a squared and regularized output error, based on sensor measurements and state estimates, with respect to the current belief of the unknown system parameters. The parameters within the MHE problem can then be updated online using stochastic gradient descent (SGD) to improve the performance of the MHE. In a numerical example of estimating the temperatures of a group of manufacturing machines, we show the performance of tuning the unknown system parameters and the benefits of integrating physical state constraints in the MHE formulation.
    Comment: This paper was accepted for presentation at the 4th Annual Conference on Learning for Dynamics and Control. The extended version here contains an additional appendix with more details on the numerical example.
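
    A minimal sketch of this idea, assuming a linear model and the cvxpylayers package for the differentiable convex optimization layer: the MHE problem is built with the unknown dynamics matrix as a parameter, the layer returns state estimates that are differentiable with respect to that parameter, and SGD on a squared output error updates the parameter belief. Dimensions, constraint bounds, and the regularization weight are illustrative.

```python
import cvxpy as cp
import numpy as np
import torch
from cvxpylayers.torch import CvxpyLayer

T, nx, ny = 10, 2, 1                       # horizon, state dim, output dim
C_np = np.array([[1.0, 0.0]])              # known output map (assumed)

# Parametric MHE problem: the dynamics matrix A is a learnable parameter.
x = cp.Variable((T + 1, nx))               # state trajectory estimate
w = cp.Variable((T, nx))                   # process-noise slacks
A = cp.Parameter((nx, nx))                 # current belief of the unknown dynamics
y = cp.Parameter((T, ny))                  # measured outputs
cost = sum(cp.sum_squares(C_np @ x[t] - y[t]) for t in range(T))
cost += 0.1 * cp.sum_squares(w)            # regularization on process noise
cons = [x[t + 1] == A @ x[t] + w[t] for t in range(T)]
cons += [x >= -10, x <= 10]                # example physical state bounds
problem = cp.Problem(cp.Minimize(cost), cons)
assert problem.is_dpp()                    # required by cvxpylayers

mhe_layer = CvxpyLayer(problem, parameters=[A, y], variables=[x, w])
A_hat = torch.eye(nx, requires_grad=True)  # parameter belief, updated by SGD
opt = torch.optim.SGD([A_hat], lr=1e-2)
C = torch.tensor(C_np, dtype=torch.float32)

y_meas = torch.randn(T, ny)                # stand-in for real sensor data
x_hat, _ = mhe_layer(A_hat, y_meas)        # differentiable MHE solve
loss = ((y_meas - x_hat[:T] @ C.T) ** 2).mean()
loss.backward()                            # gradient of output error w.r.t. A_hat
opt.step()                                 # online SGD update of the belief
```

    The state bounds stay inside the layer, so the gradient respects the constrained estimator rather than an unconstrained surrogate.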

    A predictive safety filter for learning-based racing control

    The growing need for high-performance controllers in safety-critical applications like autonomous driving has been motivating the development of formal safety verification techniques. In this paper, we design and implement a predictive safety filter that is able to maintain vehicle safety with respect to track boundaries when paired with any potentially unsafe control signal, such as those found in learning-based methods. A model predictive control (MPC) framework is used to create a minimally invasive algorithm that certifies whether a desired control input is safe and can be applied to the vehicle, or provides an alternate input to keep the vehicle in bounds. To this end, we provide a principled procedure to compute a safe and invariant set for nonlinear dynamic bicycle models using efficient convex approximation techniques. To fully support aggressive racing performance without conservative safety interventions, the safe set is extended in real time through predictive control backup trajectories. Applications to assisted manual driving and deep imitation learning on a miniature remote-controlled vehicle demonstrate the safety filter's ability to ensure vehicle safety during aggressive maneuvers.
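
    The filter itself reduces to one optimization problem per time step: find the input closest to the proposed (possibly unsafe) input such that the predicted trajectory respects the constraints and ends in a safe invariant set. The sketch below uses an assumed linear model in place of the dynamic bicycle model and a fixed ellipsoid in place of the computed safe set.

```python
import numpy as np
import cvxpy as cp

H = 20                                      # filter prediction horizon
A = np.array([[1.0, 0.1], [0.0, 1.0]])      # assumed linear model (placeholder
B = np.array([[0.005], [0.1]])              # for the dynamic bicycle model)
P = np.diag([1.0, 4.0])                     # placeholder terminal safe set x'Px <= 1

def safety_filter(x0, u_learn):
    x = cp.Variable((H + 1, 2))
    u = cp.Variable((H, 1))
    cons = [x[0] == x0,
            cp.abs(x[:, 0]) <= 1.0,         # track boundary constraint
            cp.abs(u) <= 1.0]               # input limits
    cons += [x[t + 1] == A @ x[t] + B @ u[t] for t in range(H)]
    cons += [cp.quad_form(x[H], P) <= 1.0]  # end in the safe invariant set
    # Minimally invasive: stay as close as possible to the proposed input.
    cp.Problem(cp.Minimize(cp.sum_squares(u[0] - u_learn)), cons).solve()
    return u.value[0]                       # certified (or corrected) input

u_safe = safety_filter(np.array([0.8, 0.2]), u_learn=np.array([1.0]))
```

    Whenever the proposed input is already safe, the optimizer returns it unchanged; only near the constraint boundary does the filter intervene.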

    Approximate Predictive Control Barrier Functions using Neural Networks: A Computationally Cheap and Permissive Safety Filter

    A predictive control barrier function (PCBF) based safety filter allows for verifying arbitrary control inputs with respect to future constraint satisfaction. The approach relies on the solution of two optimization problems: the first computes the minimal constraint relaxations given the current state, and the second computes the minimal deviation from a proposed input such that the relaxed constraints are satisfied. This paper presents an approximation procedure that uses a neural network to approximate the optimal value function of the first optimization problem from samples, such that the computation becomes independent of the prediction horizon. It is shown that this approximation guarantees that states converge to a neighborhood of the implicitly defined safe set of the original problem, where system constraints can be satisfied for all times forward. The convergence result relies on a novel class $\mathcal{K}$ lower bound on the PCBF decrease and depends on the approximation error of the neural network. Lastly, we demonstrate our approach in simulation for an autonomous driving example and show that the proposed approximation leads to a significant decrease in computation time compared to the original approach.
    Comment: Submitted to ECC2
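
    The offline approximation step amounts to regressing a neural network onto samples of the optimal value function of the first (relaxation) problem; online, evaluating the network replaces an optimization whose size grows with the prediction horizon. In the sketch below, the sampling oracle h_pcbf is a hypothetical stand-in for solving that relaxation problem.

```python
import torch
import torch.nn as nn

def h_pcbf(x):
    # Hypothetical oracle: in the actual method, this solves the first
    # optimization problem (minimal constraint relaxation) for each state.
    return torch.relu(x.norm(dim=-1, keepdim=True) - 1.0)

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = 4 * torch.rand(256, 2) - 2          # sample states in the region of interest
    loss = ((net(x) - h_pcbf(x)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Online, net(x) replaces the horizon-dependent optimization; its approximation
# error enters the class-K decrease bound behind the convergence guarantee.
```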